ABSTRACT

In this paper we investigate the performance of the likelihood ratio method as a tool
for identifying optical and infrared counterparts to proposed radio continuum surveys
with SKA precursor and pathfinder telescopes. We present a comparison of the
infrared counterparts identified by the likelihood ratio in the VISTA Deep Extragalactic
Observations (VIDEO) survey to radio observations with 6, 10 and 15 arcsec
resolution. We cross-match a deep radio catalogue consisting of radio sources with peak
flux density > 60 μJy with deep near-infrared data limited to K_s < 22.6. Comparing the
infrared counterparts from this procedure to those obtained when cross-matching a
set of simulated lower resolution radio catalogues indicates that degrading the
resolution from 6 arcsec to 10 and 15 arcsec decreases the completeness of the
cross-matched catalogue by approximately 3 and 7 per cent respectively. When matching
against shallower infrared data, comparable to that achieved by the VISTA Hemisphere
Survey, the fraction of radio sources with reliably identified counterparts drops from
~89%, at K_s < 22.6, to 47% with K_s < 20.0. Decreasing the resolution at this shallower
infrared limit does not result in any further decrease in the completeness produced by
the likelihood ratio matching procedure. However, we note that radio continuum surveys
with MeerKAT, and eventually the SKA, will require long baselines in order to
ensure that the resulting maps are not limited by instrumental confusion noise.
1 INTRODUCTION



Future radio continuum surveys with SKA pathfinder in- 
struments including the Evolutionary Map of the Universe 
(EMU; Norris et al. 2011), Westerbork Observations of the 
Deep APERTIF Northern-Sky (WODAN) and the surveys 
to be conducted with the Low-Frequency Array (LOFAR; 



* Based on observations collected at the European Organisation 
for Astronomical Research in the Southern Hemisphere, Chile, 
VIDEO: 179.A-2006 

† Based on observations obtained with MegaPrime/MegaCam,
a joint project of CFHT and CEA/DAPNIA, at the Canada-France-Hawaii
Telescope (CFHT) which is operated by the National Research Council (NRC)
of Canada, the Institut National des Sciences de l'Univers of the Centre
National de la Recherche Scientifique (CNRS) of France, and the University
of Hawaii. This
work is based in part on data products produced at TERAPIX 
and the Canadian Astronomy Data Centre as part of the Canada- 
France-Hawaii Telescope Legacy Survey, a collaborative project 
of NRC and CNRS. 



Röttgering et al. 2011) aim to map large areas of the sky (> 1000 sq degrees)
to very deep levels (5σ ~ 50 μJy) with the purpose of addressing a number of
key astrophysical questions. Specifically they aim to constrain the cosmic
star-formation history of the Universe out to z~2, to map the evolution of
active galactic nuclei (AGN) to the edge of the universe and to probe the
influence of postulated AGN feedback mechanisms on star-formation activity and
the formation and evolution of galaxies (Norris et al. 2011). The wide field
and source density of these surveys may be particularly suited to cosmological
studies (e.g. Raccanelli et al. 2011). On the other hand, smaller area surveys
with the eMERLIN (Muxlow 2010), EVLA (e.g. Myers et al. 2010) and the future
MeerKAT MIGHTEE surveys (Jarvis 2011) will push to much deeper flux densities
(< 5 μJy rms) to obtain a clear census of activity in the Universe up to higher
redshifts, traditionally thought of as the realm of rest-frame optical and
ultra-violet surveys. As radio observations are unaffected by dust extinction
they have the potential to provide a dust-unbiased probe of the accretion and
star-formation history of the Universe at these redshifts.

At flux densities greater than a few mJy, radio surveys
detect almost exclusively 'radio-loud' AGN whereas at lower 
flux density levels an increasing fraction of the radio source 
population is identified with star-forming galaxies. The rel- 
ative fraction of AGN and star-forming galaxies present at 
these low flux densities is not well-determined but obser- 
vational studies and extrapolations of the radio luminosity 
functions of both populations indicate that AGN constitute 
a significant fraction (up to 50%) of the radio source population
even at levels of tens of μJy (Sadler et al. 2002; Jarvis
& Rawlings 2004; Simpson et al. 2006; Seymour et al. 2008; 
Kellermann et al. 2008; Padovani et al. 2009; 2011). Thus 
there is no observational regime in which one can assume 
a faint radio source is simply associated with a starburst 
galaxy, and complementary datasets at optical and infrared 
wavelengths will play a vital role in classifying these faint ra- 
dio sources. Furthermore, optical and near-infrared imaging 
are crucial for providing estimates of a radio source redshift 
through photometric redshift techniques or through follow- 
up spectroscopy at the position of the optical/near-infrared 
counterpart. 

Attaining the scientific goals of future deep continuum 
surveys is thus largely dependent on the ability to identify 
the correct multi-wavelength counterparts to the faint radio 
sources. Identifying the counterparts to a large fraction of 
higher redshift radio sources will require very deep comple- 
mentary multiwavelength datasets. Given the resolution of 
these planned radio surveys, 10 arcsec for EMU, 15 arcsec 
for WODAN, it is possible that this necessary increase in 
depth of complementary datasets will complicate the cross-identification
process. The resolution and signal to noise ratio of the radio observations
determine a lower limit on the positional accuracy of the faint radio source
positions, dictating that the true counterparts might be located anywhere
within this positional uncertainty. However, the higher source density of the
deeper complementary data creates an increased probability of both multiple
counterparts and spurious alignments being detected within these large search
radii, rendering nearest neighbour matching techniques unreliable.

A method which is often used to identify counterparts 
to low resolution radio observations is the Likelihood Ra- 
tio (LR) technique. This technique was first developed by 
Richter (1975) and later expanded upon by de Ruiter et 
al. (1977); Sutherland & Saunders (1992) and Ciliegi et 
al. (2003). The method combines information about the 
brightness distribution of the complementary higher resolu- 
tion data and the positional errors in both the radio source 
catalogue and the complementary dataset to determine the 
most likely counterpart. It is thus of interest to determine 
how this more sophisticated matching technique will per- 
form in the proposed case of low resolution (>10 arcsec) 
radio observations matched to high resolution very deep in- 
frared and optical catalogues. To investigate this question 
this paper will degrade the positional accuracy of a set of 
deep radio observations taken with 6 arcsec resolution and 
produce a number of simulated catalogues whose positional 
accuracies are consistent with observations taken at 10 arcsec and 15 arcsec
resolution. We use a LR analysis, similar to that in Smith et al. (2011), to
determine reliable near-infrared (NIR) counterparts to both the original
catalogue and the simulated 'low resolution' catalogues and compare
the results.

The paper is structured as follows: sections 2 and 3 out- 
line the radio and infrared observations used in the matching 
procedure. Section 4 gives the details of the LR technique 
used for matching, section 5 explains the procedure used to 
simulate the low resolution catalogue and section 6 gives the 
results of the comparison. Section 7 summarises the effects 
of blended radio sources and our conclusions are presented 
in section 8. All magnitudes are quoted in the AB magnitude system. We assume
H_0 = 70 km s^-1 Mpc^-1 and an Ω_M = 0.3 and Ω_Λ = 0.7 cosmology throughout
this paper.



2 RADIO OBSERVATIONS 

The radio survey used in this analysis consists of Very 
Large Array (VLA) observations at 1.4 GHz undertaken by 
Bondi et al. (2003). These observations were used to produce a mosaic image
with nearly uniform noise of ~17 μJy over 1 square degree centred at
α(J2000) = 2h 26m 00s and δ(J2000) = -4° 30' 00". They were taken in the VLA
B-configuration and have a FWHM synthesised beamwidth of
approximately 6 arcsec.

A catalogue of radio sources was extracted from the 
mosaiced image using the AIPS (Greisen 2003) Search and 
Destroy (SAD) task, retaining only sources with a peak flux 
to local noise ratio of > 5. This procedure resulted in a catalogue of 1054
radio sources whose peak flux densities exceed 60 μJy. Of these 1054 sources 19
are identified as multiple component radio sources. Radio sources are
identified as multiple component sources if their individual components meet
the following three criteria: the components are separated by < 18 arcsec, have
peak flux ratios < 3, and all components have a peak flux > 0.4 mJy/beam.
Further details of the calibration, catalogue extraction and multi-component
classification procedures are outlined in Bondi et al. (2003).

For the purpose of evaluating the performance of the 
LR technique we disregard the multiple component radio 
sources in the catalogue as the LR relies on knowledge of the 
expected position of the infrared counterpart source and the 
associated errors on this position, and both of these quan- 
tities are poorly determined in the case of multiple compo- 
nent radio sources. We also exclude single component radio 
sources whose morphologies are asymmetric or elongated as 
the position of the potential counterpart source is not well 
known in these cases. These criteria result in the exclusion 
of a further 3 radio sources from our input catalogue. We ac- 
knowledge that such sources will be important components 
of future wide-area radio surveys, however they are not key 
to the work presented here. 



3 INFRARED OBSERVATIONS 

The square degree of VLA radio observations has been ob- 
served with the VISTA Deep Extragalactic Observations 
(VIDEO; Jarvis et al. in prep.) Survey. The VIDEO sur- 
vey is a 12 sq. degree survey over three fields with the Vis- 
ible and Infrared Survey Telescope for Astronomy (VISTA) 
Figure 1. Fit to the stellar locus used to remove stars from the
infrared cross-matching catalogue. The solid line indicates our fit 
to the stellar locus in the combined VIDEO/CFHTLS-D1 dataset. 
The dashed line indicates the separation criteria applied and ob- 
jects below the dashed line were removed from the cross-matching 
catalogue. 



and is designed to investigate the formation and evolution of 
galaxies and galaxy clusters. The survey provides photometry in the
Z, Y, J, H, K_s bands to 5σ depths of 25.7, 24.6, 24.5, 24.0, 23.5 magnitudes
(2 arcsec diameter apertures) respectively. This field also coincides with the
Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) D1 field which
provides additional photometry in the u*, g', r', i', z' optical
bands.

To minimize the number of spurious faint sources in the 
combined catalogue used for cross-matching we retain only 
those sources with K s < 22.6 (Petrosian magnitude). We 
also disregard radio sources whose counterparts are affected 
by the presence of nearby saturated stars in the infrared 
images. 

As infrared photometry to the depths of the VIDEO survey will not be available
for several years across the 1000's of square degrees surveyed by future radio
continuum surveys it is of interest to determine the likely completeness
achieved when cross-matching these large radio surveys against infrared surveys
with similarly large sky coverage. One of the largest near-infrared surveys
currently being undertaken is the VISTA Hemisphere Survey (VHS) which is
surveying the entire Southern Hemisphere to a 5σ depth of K_s = 20.0. The VHS
thus represents one of the deepest complementary datasets covering the entire
survey area of the wide-field EMU survey. We investigate the completeness
that cross-matching to this survey will produce by performing a LR cross-match
to our VIDEO catalogue limited to detections with K_s < 20.0.



3.1 Star-galaxy separation 

To remove contaminating stars from the combined NIR/optical catalogue, which
are unlikely to be genuinely associated with radio sources, we employ a colour
based criterion similar to that used by Baldry et al. (2010) in their
star-galaxy separation algorithm for target selection in the GAMA survey. To
achieve this separation we fit the stellar locus in (J-K)_AB versus (g-i)_AB
colour space with a quadratic f_locus(x) given by:

$$ f_{\rm locus}(x) = \begin{cases} -0.58 & x < 0.4 \\ -0.88 + 0.82x - 0.21x^2 & 0.4 < x < 1.9 \\ -0.08 & x > 1.9 \end{cases} \qquad (1) $$

Objects are removed from our cross-matching catalogue
if their colours meet the criterion:

$$ J - K < 0.12 + f_{\rm locus}(g - i) \qquad (2) $$

Figure [1] illustrates the fit to the stellar locus and the separation
criterion used.
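
The colour cut described above can be sketched in a few lines of Python. This is only an illustration of the piecewise stellar locus and the 0.12 mag offset quoted in the text, assuming the locus is evaluated at the (g - i) colour; the function names and the example colours are hypothetical and not taken from the survey pipeline.

```python
def f_locus(x):
    """Piecewise fit to the stellar locus; x is the (g - i) colour (AB)."""
    if x < 0.4:
        return -0.58
    if x > 1.9:
        return -0.08
    return -0.88 + 0.82 * x - 0.21 * x ** 2


def is_star(j_minus_k, g_minus_i, offset=0.12):
    """True when the object lies below the offset stellar locus."""
    return j_minus_k < offset + f_locus(g_minus_i)


# Hypothetical (J - K, g - i) colours, for illustration only.
catalogue = [(-0.60, 0.3), (0.40, 1.0), (-0.20, 1.5), (0.90, 2.5)]
galaxies = [colours for colours in catalogue if not is_star(*colours)]
```

The quadratic joins the two constant segments almost continuously (about -0.586 at x = 0.4 and -0.080 at x = 1.9), which is a useful sanity check on the coefficients.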

Another method to achieve star-galaxy classification is 
to determine how well an object is resolved in the opti- 
cal/NIR image, with stars and quasars being unresolved. 
Object detection and photometry for the VIDEO survey was 
achieved using SEXTRACTOR (Bertin & Arnouts 1996); full 
details of the extraction will be given in Jarvis et al. (in 
prep.). This package uses a neural network to assess how 
well resolved an object is and thereby determines a likelihood that the object
is a star or galaxy. This likelihood is expressed as a CLASS_STAR estimate
between 0 and 1.0 with stars having measurements close to 1.0 and galaxies
close to zero. An inspection of these CLASS_STAR estimates reveals
that the total contribution of stars to the VIDEO catalogue
after implementing our colour threshold is <5%. We chose to
use the criterion in equation [2] rather than a straightforward
CLASS_STAR cut to avoid removing quasars, which are also
unresolved in optical images and may be genuine counterparts
to the radio sources. However it should be noted that
the criterion in equation [2] also removes a small number of
quasars from our final NIR catalogue.



4 LIKELIHOOD RATIO 

The likelihood ratio is the ratio of the probability that a 
given source and counterpart are related to the probability that they are
unrelated. It is given by the relationship
(Sutherland & Saunders 1992):

$$ LR = \frac{q(m) \, f(r)}{n(m)} \qquad (3) $$

where f(r) is the radial probability distribution function of
the offsets between the radio and infrared positions, q(m) is
the expected distribution of the true infrared counterparts
as a function of K_s-band magnitude and n(m) is the magnitude
distribution of the full catalogue of K_s-band detected
objects.

The radial probability distribution f(r) is given by a
Gaussian:

$$ f(r) = \frac{1}{2\pi\sigma_{\rm pos}^2} \exp\left(-\frac{r^2}{2\sigma_{\rm pos}^2}\right) \qquad (4) $$

where r is the offset between the radio and infrared position
and σ_pos is the combined positional error of the radio and
infrared sources.
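
As a minimal sketch of how these terms combine, the Sutherland & Saunders (1992) form LR = q(m) f(r) / n(m) can be evaluated for a single candidate as follows. The numerical inputs are hypothetical, and n(m) is assumed here to be a surface density per square arcsec so that the units match those of f(r).

```python
import math


def f_r(r, sigma_pos):
    """2-D Gaussian offset distribution, normalised over the plane."""
    norm = 2.0 * math.pi * sigma_pos ** 2
    return math.exp(-r ** 2 / (2.0 * sigma_pos ** 2)) / norm


def likelihood_ratio(r, sigma_pos, q_m, n_m):
    """LR = q(m) f(r) / n(m) for one candidate counterpart."""
    return q_m * f_r(r, sigma_pos) / n_m


# Hypothetical values of q(m) and n(m) in the candidate's magnitude bin.
lr_near = likelihood_ratio(r=0.3, sigma_pos=0.5, q_m=0.10, n_m=0.002)
lr_far = likelihood_ratio(r=1.5, sigma_pos=0.5, q_m=0.10, n_m=0.002)
```

For identical magnitudes the nearer candidate always receives the larger LR, while a rarer (brighter) candidate can outweigh a modestly smaller offset through the q(m)/n(m) factor.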

Positional errors of radio sources can be ascribed to two 
independent sources of error, calibration errors and noiselike 
Figure 2. Comparison between the errors in the radio source
positions calculated using the relationships in Condon (1997),
σ_Condon, and the error estimates used in our LR analysis, σ_Ivison,
based on the relationships in Ivison et al. (2007).



errors (Condon et al. 1998). Calibration errors are indepen- 
dent of source strength and are best estimated by a com- 
parison with external, more accurate data. In contrast, the 
noiselike contribution to positional errors is a function of the 
signal to noise ratio of the detection and is thus the domi- 
nant contributor to the positional errors of sources detected 
at low signal to noise. The noiselike positional errors of ra- 
dio sources are usually estimated from the models of Condon 
(1997) for the propagation of errors in 2-dimensional Gaus- 
sian fits in the presence of Gaussian noise. However in recent 
work Ivison et al. (2007) derived a simplified expression for 
the positional errors due to the Gaussian fits in the spe- 
cial instance of all the radio sources being unresolved by the 
symmetric beam. In this instance the positional errors can 
be described by the following equation: 

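$$ \sigma_{\rm Ivison} = 0.6 \, \frac{\rm FWHM}{\rm SNR} \qquad (5) $$

with FWHM being the full width at half maximum of the synthesised beam and SNR
being the signal to noise ratio of the detected source. We adopt this
simplified description of the expected positional errors in our derivation of
σ_pos.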

The final positional errors should also include an esti- 
mate of the calibration error term. As there are no other 
radio catalogues over this area of better positional accuracy 
with which to make a comparison, Bondi et al. (2003) chose 
to estimate the calibration error term by comparing the po- 
sition of the sources in the final mosaiced image with their 
positions in the images of the single VLA pointings. They 
find on the basis of this comparison that these calibration 
errors are of the order of 0.1 arcsec. Thus we adopt as the 
expected positional error: 



$$ \sigma_{\rm pos}^2 = 0.1^2 + \sigma_{\rm Ivison}^2 \qquad (6) $$



We justify this simplified description of the errors based 
on figure [2] which presents a comparison of these simplified 
estimates with those predicted by the method of Condon 
(1997) and demonstrates that they are in reasonable agree- 
ment. Furthermore to account for the possibility that the 
radio and observed-frame K s band emission may not arise 
Figure 3. The distribution of real(m), total(m) and q(m) calculated
by the LR in our cross-matching procedure.



at exactly the same position in the galaxy we impose the 
restriction that σ_pos > 0.5 arcsec.
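
The error budget described in this section can be illustrated with a short Python sketch. It assumes that the noise-like term takes the form 0.6 × FWHM / SNR, that the 0.1 arcsec calibration term is added in quadrature, and that the 0.5 arcsec restriction is applied as a floor; the function name and the use of `max()` are choices made for this example only.

```python
import math


def sigma_pos(fwhm_arcsec, snr, sigma_cal=0.1, floor=0.5):
    """Combined positional error in arcsec.

    The noise term (assumed 0.6 * FWHM / SNR) is added in quadrature to the
    calibration term and then limited from below by the floor.
    """
    sigma_noise = 0.6 * fwhm_arcsec / snr
    return max(math.hypot(sigma_cal, sigma_noise), floor)


faint = sigma_pos(6.0, 5.0)    # source at the 5 sigma detection limit
bright = sigma_pos(6.0, 50.0)  # high signal to noise source, hits the floor
```

Under these assumptions a 5σ detection in the 6 arcsec map has an error of about 0.73 arcsec, whereas bright sources are limited by the 0.5 arcsec floor rather than by the fit.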

The n(m) term of the likelihood ratio is estimated from 
the source counts of the input VIDEO catalogue normalised 
to the area of the survey. 

The most difficult term to estimate in the LR is q(m), 
the magnitude distribution of the true counterparts to 
the radio sources. This distribution is estimated using the 
method outlined in Ciliegi et al. (2003) which begins by 
calculating the magnitude distribution of all the possible 
counterparts within a fixed search radius r max of the radio 
positions. This distribution is referred to as total(m). The 
contribution due to the background source counts is sub- 
tracted from total(m) to produce a magnitude distribution 
of the excess infrared sources detected around the radio po- 
sitions, designated as real(m). Thus 

$$ {\rm real}(m) = {\rm total}(m) - n(m) \, N_{\rm radio} \, \pi r_{\rm max}^2 \qquad (7) $$

where N_radio is the total number of radio sources in the input
catalogue.

The q(m) distribution is derived from real(m) by nor- 
malising real(m) and scaling it by a factor Qo, where Qo 
is an estimate of the fraction of radio sources with infrared 
counterparts above the magnitude limit of the VIDEO sur- 
vey. Hence: 



$$ q(m) = \frac{{\rm real}(m)}{\sum_m {\rm real}(m)} \times Q_0 \qquad (8) $$
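
The background subtraction and normalisation that lead from total(m) to q(m) can be sketched as follows. The binned inputs are hypothetical, n(m) is taken to be a surface density per square arcsec in each magnitude bin, and the helper name `q_of_m` is introduced here purely for illustration.

```python
import math


def q_of_m(total, n_m, n_radio, r_max, q0):
    """Return (real, q) per magnitude bin.

    total : number of candidates within r_max of any radio position, per bin
    n_m   : background surface density per bin (per square arcsec)
    """
    area = n_radio * math.pi * r_max ** 2  # summed search area, square arcsec
    real = [t - n * area for t, n in zip(total, n_m)]
    norm = sum(real)
    q = [q0 * x / norm for x in real]
    return real, q


# Hypothetical binned inputs, for illustration only.
total = [120.0, 300.0, 500.0]
n_m = [0.0002, 0.0010, 0.0040]
real, q = q_of_m(total, n_m, n_radio=1000, r_max=3.6, q0=0.9)
```

By construction the q(m) values sum to the adopted counterpart fraction. With real data, noisy bins can fall below zero after background subtraction; how such bins are treated is not specified in the text and is left out of this sketch.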



The Qo term is usually estimated by determining the 
fraction of sources with radio counterparts above the back- 
ground as follows: 



$$ Q_0 = \frac{N_{\rm matches} - \sum_m n(m) \, \pi r_{\rm max}^2 \, N_{\rm radio}}{N_{\rm radio}} \qquad (9) $$

where N_matches is the number of possible counterparts within
r_max of the radio positions. However in the case of a large
r_max we find that this expression leads to an overprediction
of the Q_0 value as a result of a large number of excess
sources in the search radii above the predicted background
source counts. This effect may result from the tendency for
radio-loud AGN to favour denser environments than normal
Table 1. Comparison of Q_0 estimates using the methods of
Ciliegi et al. (2003) and Fleuren et al. (2011).

                            6 arcsec   10 arcsec   15 arcsec
K_s < 22.6   Q_0 (eqn 9)    1.03       1.19        1.3
             Q_0 (eqn 10)   0.90       0.90        0.90
K_s < 20.0   Q_0 (eqn 9)    0.51       0.55        0.57
             Q_0 (eqn 10)   0.49       0.49        0.49



galaxies (Best et al. 2005; Falder et al. 2010), resulting in a 
large number of close neighbours for these sources. To over- 
come this difficulty we use the method developed by Fleuren 
et al. (submitted) to estimate Qo. 

This method determines the number of sources in the
radio catalogue which have no NIR counterparts within
search radius r of the radio positions as a function of the
size of the search radius. This function U_obs(r) is compared
to U_random(r) which is the number of sources with no NIR
counterparts within search radius r when considering an input
catalogue with the same number of sources as our original
radio catalogue but with sources placed at random positions
in the field of view. Fleuren et al. demonstrate that
the U_obs(r) and U_random(r) quantities can be related to Q_0
via the following equation:

$$ \frac{U_{\rm obs}(r)}{U_{\rm random}(r)} = 1 - Q_0 F(r) \qquad (10) $$

where:

$$ F(r) = \int_0^r P(r') \, dr' = 1 - e^{-r^2 / 2\sigma_{\rm pos}^2}, \qquad P(r) = 2\pi r f(r) \qquad (11) $$



Thus we determine Q_0 via a fit to the observed ratio
of U_obs(r) to U_random(r). We present a comparison of the
Q_0 estimates produced by the two methods in Table [1] and
adopt the estimates produced by the Fleuren et al. method
in our LR matching procedures at all resolutions.
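
One simple way to carry out such a fit is sketched below. The text does not state how the fit is performed, so this example makes two assumptions: the Gaussian width in F(r) is held fixed at a single value, and Q_0 is obtained by linear least squares on the blank-field ratio U_obs(r)/U_random(r) = 1 - Q_0 F(r). The ratio values are synthetic and generated from a known Q_0 purely to show that the estimator recovers it.

```python
import math


def big_f(r, sigma):
    """F(r) = 1 - exp(-r^2 / (2 sigma^2)), the integral of 2 pi r f(r)."""
    return 1.0 - math.exp(-r ** 2 / (2.0 * sigma ** 2))


def fit_q0(radii, ratio, sigma):
    """Least-squares Q0 for ratio(r) = 1 - Q0 * F(r), with sigma held fixed."""
    f_vals = [big_f(r, sigma) for r in radii]
    y_vals = [1.0 - u for u in ratio]
    numerator = sum(f * y for f, y in zip(f_vals, y_vals))
    return numerator / sum(f * f for f in f_vals)


# Synthetic blank fractions generated with Q0 = 0.9 and sigma = 0.7 arcsec.
radii = [0.5 * k for k in range(1, 9)]
ratio = [1.0 - 0.9 * big_f(r, 0.7) for r in radii]
q0_hat = fit_q0(radii, ratio, sigma=0.7)
```

Because the ratio flattens to 1 - Q_0 once r exceeds a few σ, the estimate is insensitive to the clustered neighbours that inflate the count-based estimate at large search radii.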

The presence or absence of more than one infrared counterpart
for a particular radio source provides extra information
to that contained in the LR itself which can then be
used to estimate the reliability of the counterpart source, or
the probability that a particular source is the correct
counterpart. The reliability is calculated as:

$$ Rel_i = \frac{LR_i}{\sum_j LR_j + (1 - Q_0)} \qquad (12) $$

where j is the index of all the possible counterparts to the
radio source. We accept sources with Rel_i > 0.8 as being a
reliably identified counterpart to the radio source. We are then
able to estimate the number of contaminating false identifications
N_cont in our catalogue of reliable identifications as
being

$$ N_{\rm cont} = \sum_{Rel_i > 0.8} (1 - Rel_i) \qquad (13) $$
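
A short Python sketch of the reliability calculation follows. The reliability itself follows the definition given above; the contamination estimate is written as the sum of (1 - Rel) over accepted identifications, which is the form commonly used in the LR literature and is assumed here. The LR values are hypothetical.

```python
def reliabilities(lrs, q0):
    """Rel_i = LR_i / (sum_j LR_j + (1 - Q0)) for one radio source."""
    denom = sum(lrs) + (1.0 - q0)
    return [lr / denom for lr in lrs]


def n_contaminants(best_rels, threshold=0.8):
    """Sum of (1 - Rel) over accepted identifications (assumed form)."""
    return sum(1.0 - rel for rel in best_rels if rel > threshold)


# Hypothetical LR values for the candidates around two radio sources.
rel_a = reliabilities([40.0, 0.5], q0=0.9)  # one dominant candidate
rel_b = reliabilities([3.0, 2.5], q0=0.9)   # two comparable candidates
accepted = [max(r) for r in (rel_a, rel_b) if max(r) > 0.8]
n_cont = n_contaminants(accepted)
```

The second source illustrates why the reliability carries information beyond the LR: two candidates with comparable LR values split the probability between them, so neither passes the 0.8 threshold.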



5 SIMULATED CATALOGUE 

This study aims to determine whether the LR cross- 
matching procedure will allow us to reliably identify a large 



fraction of the counterparts to the radio sources detected in 
future radio continuum surveys including EMU, MIGHTEE 
and WODAN. We aim to compare the performance of this 
technique at 6 arcsec resolution with those of observations 
with 10 and 15 arcsec resolution, which are the proposed res- 
olutions of the EMU and WODAN surveys respectively. To 
simulate the degradation of positional accuracy in the Bondi
et al. (2003) catalogue which would take place if these observations
were performed to the same depth with a larger
synthesised beam we add Gaussian scatter to the positions of
the radio sources in the catalogue in line with the theoretical
predictions of equations [5] and [6] in section [4]. We generate
100 simulated 'low resolution' catalogues at each of the simulated
FWHM beamwidths of 10 and 15 arcsec. The limitation of
this approach is that it precludes us from studying the in- 
stances where close pairs of radio sources merge within the 
larger synthesised beam and the impact this blending may 
have on our ability to make reliable counterpart identifica- 
tions. We attempt to estimate the effect of blended radio 
sources separately from our LR analysis in section 7. 
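
The degradation step can be sketched in Python as follows. The text does not give the exact amplitude of the added scatter, so this example assumes that the noise-like error scales as 0.6 × FWHM / SNR and that the added per-axis scatter is the quadrature difference between the errors at the new and original beam sizes. Positions are treated as tangent-plane offsets in arcsec, and all names and input values are hypothetical.

```python
import math
import random


def noise_sigma(fwhm, snr):
    """Noise-like positional error in arcsec (assumed 0.6 * FWHM / SNR)."""
    return 0.6 * fwhm / snr


def degrade_positions(sources, fwhm_old, fwhm_new, rng):
    """Scatter (x, y, snr) positions to mimic a larger synthesised beam.

    Assumption: the added per-axis scatter is the quadrature difference
    between the noise-like errors at the new and original beam sizes.
    """
    out = []
    for x, y, snr in sources:
        new_err = noise_sigma(fwhm_new, snr)
        old_err = noise_sigma(fwhm_old, snr)
        extra = math.sqrt(new_err ** 2 - old_err ** 2)
        out.append((x + rng.gauss(0.0, extra), y + rng.gauss(0.0, extra), snr))
    return out


# Hypothetical tangent-plane positions (arcsec) with detection SNR.
sources = [(0.0, 0.0, 5.0), (100.0, 50.0, 20.0)]
rng = random.Random(42)
simulated = [degrade_positions(sources, 6.0, 10.0, rng) for _ in range(100)]
```

Each of the 100 realisations keeps the flux densities and signal to noise ratios of the original catalogue and changes only the positions, which is why source blending cannot be studied with this approach.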



6 NEAR-INFRARED COUNTERPARTS TO 
RADIO SOURCES 

We use the LR technique to find counterparts for both the 
original VLA radio catalogue and the two sets of 100 sim- 
ulated catalogues with nominal FWHM beamwidths of 10 
and 15 arcsec. We first match the radio sources to almost 
the full depth of the VIDEO catalogue with K_s < 22.6 and
then repeat this procedure with the VIDEO catalogue restricted
to detections with K_s < 20.0. In order to ensure
that our search radius includes all possible real counterparts
to the radio sources we set r_max to 5 times the largest expected
positional error σ_pos at each of the three resolutions
considered in this study. This results in an r_max of 3.6, 6.0
and 9.0 arcsec in the LR analysis of the 6, 10 and 15 arcsec
catalogues respectively. The f(r) term of the LR is also adjusted
to account for the increased positional uncertainty in
the lower resolution catalogues. A summary of the relevant 
parameters used in the LR analysis at the three different 
resolutions when matching against the deeper and shallower 
near-infrared catalogue is given in Tables [2] and [3] respec- 
tively. For the simulated catalogues these tables contain the 
mean and standard deviation of the 100 LR matching pro- 
cedures performed at each resolution. 
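
The quoted search radii can be checked with a few lines of Python. The sketch assumes that the largest expected positional error occurs at the catalogue's signal to noise limit of 5, and that it is given by a noise term 0.6 × FWHM / SNR combined in quadrature with the 0.1 arcsec calibration term; the function name is hypothetical.

```python
import math


def r_max(fwhm, min_snr=5.0, sigma_cal=0.1, n_sigma=5.0):
    """Search radius: n_sigma times the largest expected positional error.

    The largest error is taken at the catalogue's SNR limit, assuming a noise
    term 0.6 * FWHM / SNR plus the calibration term in quadrature.
    """
    sigma_largest = math.hypot(sigma_cal, 0.6 * fwhm / min_snr)
    return n_sigma * sigma_largest


radii = {fwhm: round(r_max(fwhm), 1) for fwhm in (6.0, 10.0, 15.0)}
```

Under these assumptions the radii round to 3.6, 6.0 and 9.0 arcsec for the 6, 10 and 15 arcsec catalogues, in agreement with the values adopted in the text.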



6.1 Counterparts as a function of resolution 

An inspection of Table [2] reveals that when cross-matching
against the full VIDEO catalogue the number of sources
with reliable counterparts decreases with decreasing resolution.
We identify 915, 887 and 838 sources with reliable
counterparts at 6, 10 and 15 arcsec resolution, with each decrease
in resolution resulting in a loss of approximately 3%
and then a further 4% of the identifications. The completeness
as a function of flux density for all three resolutions is
plotted in figures [4] and [5]. These figures indicate that the
number of lost identifications increases at lower flux densities
where the lower signal to noise ratio of the detections
results in larger positional uncertainties. For clarity figure [5]
presents a close-in view of the completeness at the fainter
Table 2. Summary of relevant parameters in the LR analysis of
the 6, 10 and 15 arcsec catalogues when matching against the
VIDEO catalogue with K_s < 22.6. N(Rel>0.8) and % Rel>0.8
represent the number and percentage of radio sources which have
counterparts with Rel>0.8. The total number of radio sources in
the input catalogue is 1031. Similarly N_cont and % cont represent
estimates of the number and percentage contribution of misidentified
contaminating sources.

                   6 arcsec   10 arcsec         15 arcsec
r_max [arcsec]     3.6        6.0               9.0
Q_0                0.90       0.90              0.90
N(Rel>0.8)         915        887.32 ± 5.91     837.91 ± 7.18
N_no match         68         50.9 ± 1.71       32.63 ± 1.75
N < r_max          1274       1809.34 ± 9.36    2669.30 ± 12.61
N_cont             6.825      12.405 ± 0.809    19.346 ± 0.93
% Rel>0.8          88.7%      86.0 ± 0.6%       81.8 ± 0.7%
% cont             0.74%      1.4 ± 0.1%        2.3 ± 0.1%



Table 3. Summary of relevant parameters in the LR analysis
of the 6, 10 and 15 arcsec catalogues with K_s < 20.0. The row
headings are as in Table [2].

                   6 arcsec   10 arcsec         15 arcsec
r_max [arcsec]     3.6        6.0               9.0
Q_0                0.49       0.49              0.49
N(Rel>0.8)         486        490.31 ± 3.65     485.09 ± 5.08
N_no match         510        484.01 ± 2.22     437.53 ± 3.30
N < r_max          567        645.91 ± 3.02     775.81 ± 5.19
N_cont             3.931      6.839 ± 0.555     11.062 ± 0.708
% Rel>0.8          47.1%      47.5 ± 0.6%       47.0 ± 0.5%
% cont             0.80%      1.4 ± 0.1%        2.3 ± 0.1%



flux densities (< 1 mJy) and the greyscale filled regions in
this figure indicate the 1σ variation between our 100 simulated
catalogues at each resolution.

Encouragingly, our results indicate that at 6 arcsec resolution
we are able to identify nearly all the available counterparts
brighter than the imposed K_s = 22.6 magnitude limit, as the
fraction of reliably identified sources, 89%, is very close to our
estimated Q_0 value of 0.90. Our estimates also indicate that the
contribution of contaminating or misidentified sources is very low
at ~0.7%. Table [2] also reveals that our estimate of the number of
contaminating sources in our cross-matched catalogue increases at
lower resolution to 1.4% and 2.3% at resolutions of 10 and
15 arcsec.

A subsection of the radio data in this paper has previously
been matched to deep K-band data using a LR procedure
(Ciliegi et al. 2005). This matching was performed to
a K-band depth of 23.9 over a 165 arcmin² field observed
by Iovino et al. (2005); the limiting magnitude used in the
matching procedure corresponds to the 50 per cent completeness
limit of the Iovino survey. Ciliegi et al. (2005) find a
total of 43 reliable K-band matches to the 65 radio sources
located within this subfield, corresponding to a completeness
of ~66 per cent, which is significantly lower than the
88.7 per cent completeness achieved in this work. We ascribe
this improvement to the greater depth of the VIDEO survey,
which has a factor of ~12 greater integration time over
the Iovino catalogue with a telescope of similar aperture.
Figure 4. The fraction of reliable counterparts detected at 6, 10
and 15 arcsec resolution when matching against the VIDEO NIR
catalogue restricted to detections with K_s < 22.6 and K_s < 20.0.
The greyscale bands represent the 1σ Poisson error on the
cross-matched fractions.

Figure 5. Close-in plot of the fraction of reliable counterparts
detected for the faint radio sources (< 1 mJy) at 6, 10 and 15 arcsec
resolution when matching against the VIDEO NIR catalogue
restricted to detections with K_s < 22.6. The greyscale filled bands
represent the 1σ variation between the 100 simulated low resolution
radio catalogues and do not include the Poisson errors.



Furthermore the q(m) distributions and LR in Ciliegi et al.
(2005) are derived from the VVDS optical catalogues (McCracken
et al. 2003) available over the whole 1 square degree
radio field. As q(m_opt) distributions are not precisely equivalent
to q(m_NIR) it is likely that their use of the optical magnitude
distribution in the matching procedure contributes to
an underestimate of the significance of some of the fainter
NIR matches.



6.2 Counterparts as a function of near-infrared 
magnitude 

In the case of matching against the VIDEO catalogue limited
to the depth of the VHS, Table [3] reveals a similar increasing
trend in the number of contaminating sources with
decreasing resolution from 0.8 per cent at 6 arcsec to 1.4 and
2.3 per cent at the lower resolutions. However the completeness
of the cross-matched catalogue is nearly identical at
all three resolutions, indicating that the depth of the
complementary near-infrared data is a more relevant limiting
factor at these shallower survey depths than radio survey
resolution. This trend can be understood by examining the
middle plot in figure which indicates that NIR counterparts
with K_s < 20.0 are assigned
higher q(m)/n(m) fractions than fainter NIR matches. The
intrinsic rarity of brighter NIR sources thus increases the
significance of these bright NIR matches allowing us to
partially overcome the limitation of poorer positional accuracy.
In contrast at deeper NIR magnitudes the increasing density
of faint sources dictates that resolution, or equivalently
positional accuracy, is increasingly relevant in determining
the correct counterpart.

Figure 6. The fraction of reliably associated counterparts as a
function of K_s magnitude at 6 arcsec resolution for the depth
of our radio imaging data. The greyscale bands represent the 1σ
Poisson error on the cross-matched fraction.

In figure [6] we show a plot of the fraction of reliably 
identified counterparts as a function of K_s band magnitude.
This demonstrates that deep near-infrared and/or optical
data are crucial for successfully identifying faint radio
sources, at least to the depth of our radio imaging data and 
this will only become more of an issue for yet deeper ra- 
dio imaging such as those planned with the MeerKAT (e.g. 
Jarvis 2011). We note that once again the reliably identified 
fraction of radio sources is very close to the Qo estimate 
indicating that we are identifying nearly all the available 
counterparts with the NIR catalogue to K s < 22.6. 

6.3 Mis-identified counterparts at low resolution 

Apart from considering changes to the overall completeness 
with resolution it is also of interest to determine whether 
there are differences between the low and high resolution 
catalogues in terms of the subset of radio sources that 
have reliably identified counterparts and whether these radio 
sources are associated with the same NIR counterpart in all 
cases. Changes in the exact composition of the output
cross-matched catalogues occur because the Gaussian scatter
introduced to the positions of the simulated low resolution radio
sources alters their position relative to any possible
near-infrared counterparts; furthermore, the lower resolution
catalogues have larger positional uncertainties $\sigma_{pos}$.
These two factors result in changes to the $f(r)$ term of the LR
and consequently alter the overall statistical significance of
a match between any pair of sources. Changes in the cross-matched
low resolution catalogues compared to the original VLA
cross-matched catalogue occur in three different forms: (i) radio
sources with secure identifications at 6 arcsec that no longer
have secure identifications in the low resolution catalogue,
which we refer to as $R_{lose}$; (ii) radio sources with no
reliable counterpart at 6 arcsec that have an identified
counterpart at lower resolution, $R_{gain}$; and (iii) radio
sources with secure identifications at both resolutions,
$R_{common}$, that are identified with a different NIR counterpart
at the different resolutions, $R_{diff\,id}$. The average changes
in the composition of the output catalogues are listed in
Tables 4 and 5; these are relative to the original VLA
cross-matched catalogue in all cases. These tables indicate that
differences in the exact composition of the output cross-matched
catalogues at the three different resolutions are usually small.
When matching against the deeper VIDEO catalogue, 95 and 88 per
cent of the radio sources in the original cross-matched catalogue
are identified with the identical NIR counterpart in the 10 and
15 arcsec catalogues. The catalogue with shallower NIR magnitude
limits produced even fewer discrepancies, with 98 and 95 per cent
of the identifications remaining unchanged at lower resolution.
This clearly illustrates the increased difficulty in associating
the radio sources with their correct counterparts when
matching against very deep complementary datasets.

Table 4. Summary of differences between the cross-matched
catalogues created when matching against the original 6 arcsec
resolution catalogue and the simulated low resolution catalogues
at 10 and 15 arcsec. This table summarises the differences when
matching against the VIDEO NIR catalogue limited to $K_s < 22.6$.

                  10 arcsec         15 arcsec
$R_{lose}$        41.80 ± 4.78      94.10 ± 6.36
$R_{gain}$        14.12 ± 2.89      17.00 ± 3.02
$R_{common}$      873.20 ± 4.78     820.91 ± 6.36
$R_{diff\,id}$    5.10 ± 2.59       9.97 ± 3.22

Table 5. Summary of differences between the cross-matched
catalogues created when matching against the original 6 arcsec
resolution catalogue and the simulated low resolution catalogues
at 10 and 15 arcsec resolution. This table summarises the
differences when matching against the VIDEO NIR catalogue limited
to $K_s < 20.0$.

                  10 arcsec         15 arcsec
$R_{lose}$        9.71 ± 2.69       23.62 ± 3.86
$R_{gain}$        14.02 ± 2.42      22.72 ± 2.86
$R_{common}$      476.30 ± 3        462.37 ± 3.85
$R_{diff\,id}$    0.62 ± 0.82       2.02 ± 1.36
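The unchanged-identification percentages can be approximately recovered from the mean counts in Tables 4 and 5. The sketch below assumes that the number of securely identified sources in the original 6 arcsec catalogue is the sum of the "common" and "lose" rows, and that the unchanged identifications are the "common" row minus the "diff id" row. This definition is an assumption made here for illustration rather than one stated in the text, and the dictionary keys are hypothetical labels.

```python
# Mean counts transcribed from Tables 4 and 5 (uncertainties omitted).
TABLES = {
    ("Ks<22.6", 10): {"lose": 41.80, "common": 873.20, "diff_id": 5.10},
    ("Ks<22.6", 15): {"lose": 94.10, "common": 820.91, "diff_id": 9.97},
    ("Ks<20.0", 10): {"lose": 9.71, "common": 476.30, "diff_id": 0.62},
    ("Ks<20.0", 15): {"lose": 23.62, "common": 462.37, "diff_id": 2.02},
}


def unchanged_fraction(row):
    """Fraction of the original identifications kept with the same counterpart.

    Assumed definition: (common - diff_id) / (common + lose).
    """
    original = row["common"] + row["lose"]
    return (row["common"] - row["diff_id"]) / original


unchanged = {key: unchanged_fraction(row) for key, row in TABLES.items()}

for (limit, resolution), fraction in unchanged.items():
    print(f"{limit}, {resolution} arcsec: {100.0 * fraction:.1f} per cent unchanged")
```

Under this assumed definition the results lie within about one per cent of the values quoted in the text.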

6.4 Redshift distributions of identified radio 
sources 

Photometric redshifts for the combined VIDEO and
CFHTLS-D1 datasets have been derived using SED fitting
techniques with the photometric redshift package Le Phare
(Arnouts et al. 1999; Ilbert et al. 2006). Based on simulations,
these redshifts have accuracies of $\sigma_z \sim 0.1$ for sources
with $K_s < 22.6$, and for a small sample of real objects with
spectroscopic redshifts from the VVDS survey (Le Fèvre et
al. 2007) $\sigma_z \sim 0.095$. Figure 7 compares the redshift
distribution of sources matched at 6 arcsec to almost the full
depth of the VIDEO survey with that of sources restricted to
matches with $K_s < 20.0$. Unsurprisingly, there is clear evidence
for a decline in the fraction of high redshift sources detected
in the survey with shallower magnitude limits. It is clear
that while the VHS survey will allow us to identify a
significant fraction, ~83 per cent, of the sources with z < 1.2, a
large fraction of the radio sources at redshifts higher than
this threshold will not be present in this shallower wide-field
survey. The histogram indicates that only 14 per cent of the
identified counterparts at z > 1.2 had magnitudes brighter
than the $K_s < 20.0$ limit. There is also evidence for a loss of
a significant number (~40 per cent) of counterparts in the
1 < z < 1.2 redshift bin.

Figure 7. The photometric redshift distribution of the
counterparts in the matched catalogues restricted to sources with
magnitudes $K_s < 22.6$ and $K_s < 20.0$.



7 BLENDED SOURCES 

As mentioned in Section 5, our LR investigation makes no
attempt to consider the possible effects of an increased number
of blended sources at lower resolutions on the completeness
produced by the LR technique. To estimate the increase in
the number of radio sources which will be blended by the
beam in 10 and 15 arcsec images, compared to the 6 arcsec
resolution image, we use the size and spatial distribution of
radio sources predicted in the Square Kilometer Array Simulated
Skies (Wilman et al. 2008; 2010). From this simulation
we extract a list of radio source components with $S_{1.4\,GHz}$
greater than the predicted $5\sigma$ flux density limit of the EMU
survey of 50 $\mu$Jy over a 1 square degree field of view. The
sizes of the radio components are adjusted to simulate the
effect of convolution with a Gaussian beam of FWHM size
of 6, 10 and 15 arcsec respectively.

Table 6. The number of blended sources detected at 6, 10
and 15 arcsec resolution predicted by the SKA Simulated Skies
(Wilman et al. 2008; 2010). Column 1 indicates the number of
components blended per detected source.

Number of components         6 arcsec   10 arcsec   15 arcsec

Unrelated blends
2                            22         36          62

Multiple component blends
2                            218        218         219
3                            41         41          41
4 or more                    2          4           4

Separating close components in an astronomical image
is a complex problem and source extraction packages adopt
a variety of approaches to this task. For instance, the AIPS
SAD task attempts to separate emission features into multiple
components based on the level of residual flux present
after a single component Gaussian fit, whereas SExtractor
maps the detected emission at a number of sub-thresholds to
detect junctions in the emission profile of the blended source
(Bertin & Arnouts 1996). Consequently, the probability of
separating a pair of close sources is difficult to quantify and
depends on the characteristics of the pair, including their
separation, relative angular sizes and peak fluxes, as well as the
details of the deblending technique in question. To obtain an
estimate of the fraction of blended pairs in our catalogues
we make the simplifying assumption that a source extraction
algorithm will be unable to separate a pair of sources
if their separation is less than the mean of their FWHM.
In this determination we disregard the contribution of very
extended sources (> 17 arcsec), as these should be fully
resolved by the beam. These extended sources constitute less
than 5 per cent of the total source population at our chosen
flux density limits.
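The separation criterion can be written as a short sketch. It assumes that the observed FWHM of a component is its intrinsic size added in quadrature to the beam FWHM (the result of convolving two Gaussians), and that the 17 arcsec cut applies to the intrinsic size. Both choices, and all function and parameter names, are assumptions made for illustration, since the text does not specify these details.

```python
import math


def convolved_fwhm(intrinsic_fwhm, beam_fwhm):
    """Observed FWHM (arcsec) of a Gaussian component convolved with a Gaussian beam."""
    return math.hypot(intrinsic_fwhm, beam_fwhm)


def is_blended(separation, size_a, size_b, beam_fwhm, max_size=17.0):
    """Return True if a pair is treated as blended under the simplifying criterion.

    A pair counts as blended when its separation (arcsec) is smaller than the
    mean of the two observed FWHMs. Pairs containing a very extended component
    (intrinsic size above max_size, assumed here) are disregarded.
    """
    if size_a > max_size or size_b > max_size:
        return False
    mean_fwhm = 0.5 * (
        convolved_fwhm(size_a, beam_fwhm) + convolved_fwhm(size_b, beam_fwhm)
    )
    return separation < mean_fwhm


# Hypothetical pair: two 1 arcsec components separated by 8 arcsec.
blend_status = {beam: is_blended(8.0, 1.0, 1.0, beam) for beam in (6.0, 10.0, 15.0)}

for beam, blended in blend_status.items():
    print(f"{beam:4.0f} arcsec beam: {'blended' if blended else 'separated'}")
```

For this hypothetical pair the components remain separate at 6 arcsec but merge into a single detection at 10 and 15 arcsec.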

To confirm that this simplification is reasonable we created
maps of the simulated radio source components at 6, 10
and 15 arcsec resolution using the Simulated Skies S3Map
tool (Levrier et al. 2009). The AIPS SAD task was used to
extract a source list from these images, and an inspection of
the output component lists confirmed that the pairs selected
using our separation criterion were detected as a single
Gaussian component by the SAD algorithm. Based on our
separation criterion and an inspection of the SAD outputs
we determine the number of detected radio sources in the
simulated field which consist of a blend of two or more
underlying radio source components. We consider separately
detected radio sources which consist of a blend of multiple
radio components arising from the same radio galaxy (i.e.
radio lobes from an FRI/FRII source blended together) and
sources which are a blend of unrelated radio galaxies. The
results of our analysis are presented in Table 6.

The initial simulated component list consisted of 1908
components with $S_{1.4\,GHz} > 50$ $\mu$Jy. These are reduced to
1579, 1557 and 1531 detected sources, of which 283, 299 and
325 detections are blended sources in the 6, 10 and 15 arcsec
catalogues respectively. As the LR is not designed to
account for the possibility of two or more counterparts per
radio source, it is reasonable to assume that sources which
consist of a blend of unrelated radio galaxies will not be
reliably associated with an appropriate counterpart. The
total contribution of these unrelated blends increases from
1.4 per cent at 6 arcsec resolution to 2.3 and 4.0 per cent
at 10 and 15 arcsec resolutions respectively. Thus we conclude
that the contribution of this effect to incompleteness
in any LR based cross-matching routine in future surveys
will be small, at the level of approximately 1 to 2.5 per cent.
The real contribution of this effect will depend on the
baseline distribution and uv coverage of the survey in question.
We do not attempt to estimate to what extent the blended
multiple component radio sources will contribute towards
incompleteness when cross-matching, as the symmetric nature
of the SKADS simulation radio sources will not allow us to
realistically determine to what extent the positions of the
final blended radio sources will deviate from the expected
position of the NIR counterpart.
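The quoted percentages follow directly from the unrelated-blend counts in Table 6 and the numbers of detected sources given above. The following is a minimal arithmetic check; the variable names are illustrative.

```python
# Detected sources per resolution (arcsec), as quoted in the text.
detected = {6: 1579, 10: 1557, 15: 1531}

# Unrelated two-component blends per resolution, from Table 6.
unrelated_blends = {6: 22, 10: 36, 15: 62}

# Percentage of detected sources that are blends of unrelated radio galaxies.
blend_percent = {
    res: 100.0 * unrelated_blends[res] / detected[res] for res in detected
}

# Additional incompleteness relative to the 6 arcsec catalogue.
extra_percent = {res: blend_percent[res] - blend_percent[6] for res in (10, 15)}

for res in detected:
    print(f"{res:2d} arcsec: {blend_percent[res]:.1f} per cent unrelated blends")
for res, extra in extra_percent.items():
    print(f"{res:2d} arcsec: +{extra:.1f} per cent relative to 6 arcsec")
```

The differences relative to the 6 arcsec catalogue are the origin of the approximate range of additional incompleteness quoted in the text.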



8 TOWARDS DEEPER SURVEYS WITH 
MEERKAT 

Although in this paper we have concentrated on our ability
to cross-match radio continuum sources from surveys such
as those proposed for ASKAP and APERTIF, it is clear that
similar techniques may also be appropriate for much deeper
(~100 nJy) and narrower surveys such as the MeerKAT
International Giga-Hertz Tiered Extragalactic Exploration
(MIGHTEE) Survey (Jarvis 2011). Although we do not currently
have the necessary data to test how well we can recover
counterparts to the radio sources at these depths, it is
clear that our ability to identify shorter wavelength
counterparts becomes worse towards fainter flux densities (e.g.
Figure 5) due to both the declining signal-to-noise and the
increasing density of counterpart sources. However, the
resolution of the radio maps is also a contributing factor. The
issue of spatial resolution will only become more difficult to
deal with at deeper flux densities, as surveys edge closer to the
classical confusion level of ~25 beams per source (Condon 1974).
The current design of MeerKAT, incorporating 20 km maximum
baselines, will provide a spatial resolution of ~3 arcsec at
1.4 GHz, which for a 10 $\mu$Jy flux-density limit corresponds to
~40 beams per source according to the simulated skies of
Wilman et al. (2010). Thus, as we approach the new parameter
space in flux density and survey area offered by
MeerKAT, and eventually the SKA, the crucial aspect of
telescope design will be how to achieve ~1 arcsec resolution
coupled with high surface brightness sensitivity.
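The "beams per source" figure of merit can be sketched as follows. The sketch assumes a Gaussian beam with solid angle $\pi\theta^2/(4\ln 2)$, where $\theta$ is the FWHM. Other beam-area conventions exist, and the definition used for the numbers quoted above is not stated, so the output is indicative only. The function names are hypothetical.

```python
import math

ARCSEC2_PER_DEG2 = 3600.0**2


def gaussian_beam_area(fwhm):
    """Solid angle (arcsec^2) of a circular Gaussian beam with the given FWHM (arcsec)."""
    return math.pi * fwhm**2 / (4.0 * math.log(2.0))


def beams_per_source(sources_per_deg2, fwhm):
    """Number of beam areas available per source at a given surface density."""
    return ARCSEC2_PER_DEG2 / (gaussian_beam_area(fwhm) * sources_per_deg2)


def implied_density(beams, fwhm):
    """Surface density (deg^-2) that corresponds to a given number of beams per source."""
    return ARCSEC2_PER_DEG2 / (gaussian_beam_area(fwhm) * beams)


# Detected source counts in the simulated 1 square degree field of Section 7.
wide_6 = beams_per_source(1579, 6.0)
wide_15 = beams_per_source(1531, 15.0)

# Density implied by ~40 beams per source at ~3 arcsec resolution.
deep_density = implied_density(40.0, 3.0)

print(f"6 arcsec:  {wide_6:.0f} beams per source")
print(f"15 arcsec: {wide_15:.0f} beams per source")
print(f"40 beams per source at 3 arcsec implies {deep_density:.0f} sources per deg^2")
```

Under this convention the simulated 50 $\mu$Jy field gives roughly 200 beams per source at 6 arcsec and roughly 33 at 15 arcsec, so the lowest resolution considered here already lies close to the classical confusion level quoted above.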



9 CONCLUSIONS 

We have presented a comparison of the infrared counterparts
identified by the LR technique to radio sources observed
with synthesized beamwidths of 6, 10 and 15 arcsec
resolution when matched against a NIR catalogue at depths
of $K_s < 22.6$ and 20.0. The results of our analysis indicate
that we are able to reliably associate nearly all the available
radio source counterparts in the NIR catalogue, limited to
$K_s < 22.6$, with the appropriate radio source. Furthermore,
~93 and 88 per cent of the identifications made by this
technique remain unchanged when matching the lower resolution
10 and 15 arcsec catalogues to the deeper NIR catalogue. At
all resolutions the technique delivers a catalogue with a high
degree of completeness and a low percentage of contaminating
misidentified sources. When matching against the shallower
NIR catalogue the fraction of unchanged counterparts
increases to 97 and 94 per cent in the 10 and 15 arcsec cases.

Although changes in completeness and contamination
fractions in our study were all relatively small as a function
of resolution, it is clear that both of these quality indicators
degrade systematically as the depth of the matching
complementary data increases and the resolution of the radio
data decreases.

We conducted a brief investigation into the question of
unrelated radio sources being blended together at lower
resolutions and reducing the completeness achieved by the LR
cross-matching technique. We conclude that at a flux density
limit of ~50 $\mu$Jy, comparable to the EMU wide-field survey,
the contribution of this effect is likely to be small, at the
level of only a few per cent. Finally, the distribution of
photometric redshifts in our matched catalogues indicates that
a significant fraction of the radio sources at z > 1.2 will not
be detected in the shallower wide-field VISTA Hemisphere
Survey.